 block size




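$$
\mathrm{SU}(2) = R(\theta_g, \Delta\theta, \Delta\omega)
= \begin{pmatrix} t & kj \\ kj & t \end{pmatrix}
\begin{pmatrix} e^{j\omega_P} & 0 \\ 0 & e^{j\omega_W} \end{pmatrix}
\begin{pmatrix} t & kj \\ kj & t \end{pmatrix}
\begin{pmatrix} e^{j\theta_T} & 0 \\ 0 & e^{j\theta_L} \end{pmatrix}
= e^{j\theta_g}
\begin{pmatrix} \sin\frac{\Delta\omega}{2} & \cos\frac{\Delta\omega}{2} \\ \cos\frac{\Delta\omega}{2} & -\sin\frac{\Delta\omega}{2} \end{pmatrix}
\begin{pmatrix} e^{j\frac{\Delta\theta}{2}} & 0 \\ 0 & e^{-j\frac{\Delta\theta}{2}} \end{pmatrix}
$$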

Neural Information Processing Systems

A.1 Mach-Zehnder Interferometers (MZIs)

A basic coherent optical component used in this work is an MZI. One of the most general MZI structures is shown in Figure 15, consisting of two 50-by-50 optical directional couplers and four phase shifters $\theta_T$, $\theta_L$, $\omega_P$, and $\omega_W$. An MZI can achieve arbitrary $2 \times 2$ unitary matrices SU(2).

Figure 15: 2-by-2 MZI with top (T), left (L), upper (P), and lower (W) phase shifters.

A.2 MZI-based Photonic Tensor Core Architecture

By cascading $N(N-1)/2$ MZIs into a triangular mesh (Reck-style) or rectangular mesh (Clements-style), we can construct an arbitrary $N \times N$ unitary matrix U(N).
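The MZI description above can be checked numerically with a minimal sketch. It assumes ideal 50-by-50 couplers with $t = k = 1/\sqrt{2}$ and the factorization coupler, inner phases, coupler, outer phases. The function names (`mzi`, `is_unitary`, `num_mzis`) are illustrative, not taken from the paper.

```python
import cmath
import math


def matmul2(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def phase(top, bottom):
    """Diagonal phase-shifter matrix diag(e^{j*top}, e^{j*bottom})."""
    return [[cmath.exp(1j * top), 0], [0, cmath.exp(1j * bottom)]]


def mzi(theta_t, theta_l, omega_p, omega_w):
    """Transfer matrix: coupler * diag(omega) * coupler * diag(theta)."""
    t = k = 1 / math.sqrt(2)  # assumption: ideal, lossless 50:50 coupler
    coupler = [[t, 1j * k], [1j * k, t]]
    m = matmul2(coupler, phase(omega_p, omega_w))
    m = matmul2(m, coupler)
    return matmul2(m, phase(theta_t, theta_l))


def is_unitary(m, tol=1e-12):
    """Return True if m^H m equals the identity up to tol."""
    mh = [[m[j][i].conjugate() for j in range(2)] for i in range(2)]
    p = matmul2(mh, m)
    return all(abs(p[i][j] - (1 if i == j else 0)) < tol
               for i in range(2) for j in range(2))


def num_mzis(n):
    """Number of MZIs in a Reck- or Clements-style mesh for U(n)."""
    return n * (n - 1) // 2
```

For any real phase settings, `is_unitary(mzi(...))` should hold, and `num_mzis(4)` gives the six MZIs needed for a 4-by-4 mesh. The sketch ignores loss, crosstalk, and phase noise.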


DOS: Dependency-Oriented Sampler for Masked Diffusion Language Models

arXiv.org Machine Learning

Masked diffusion language models (MDLMs) have recently emerged as a new paradigm in language modeling, offering flexible generation dynamics and enabling efficient parallel decoding. However, existing decoding strategies for pre-trained MDLMs predominantly rely on token-level uncertainty criteria, while largely overlooking sequence-level information and inter-token dependencies. To address this limitation, we propose Dependency-Oriented Sampler (DOS), a training-free decoding strategy that leverages inter-token dependencies to inform token updates during generation. Specifically, DOS exploits attention matrices from transformer blocks to approximate inter-token dependencies, emphasizing information from unmasked tokens when updating masked positions. Empirical results demonstrate that DOS consistently achieves superior performance on both code generation and mathematical reasoning tasks. Moreover, DOS can be seamlessly integrated with existing parallel sampling methods, leading to improved generation efficiency without sacrificing generation quality.



PrivCirNet: Efficient Private Inference via Block Circulant Transformation

Neural Information Processing Systems

Homomorphic encryption (HE)-based deep neural network (DNN) inference protects data and model privacy but suffers from significant computation overhead. We observe that transforming the DNN weights into circulant matrices converts general matrix-vector multiplications into HE-friendly 1-dimensional convolutions, drastically reducing the HE computation cost.
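The circulant observation can be illustrated in plaintext with a minimal sketch. It assumes each `b`-by-`b` weight block is circulant and stored only as its first column. No encryption is modeled, and the helper names are hypothetical rather than PrivCirNet's API.

```python
def circular_conv(c, x):
    """1-D circular convolution: y[i] = sum_j c[(i - j) mod n] * x[j]."""
    n = len(c)
    return [sum(c[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]


def block_circulant_matvec(blocks, x, b):
    """Multiply a block-circulant matrix by x using only convolutions.

    blocks[r][s] is the length-b first column of circulant block (r, s).
    """
    y = []
    for row in blocks:
        acc = [0] * b
        for s, c in enumerate(row):
            part = circular_conv(c, x[s * b:(s + 1) * b])
            acc = [a + p for a, p in zip(acc, part)]
        y.extend(acc)
    return y


def dense_from_blocks(blocks, b):
    """Expand the compact form into a dense matrix (for checking only)."""
    rows = []
    for row in blocks:
        for i in range(b):
            rows.append([c[(i - j) % b] for c in row for j in range(b)])
    return rows


def dense_matvec(m, x):
    """Ordinary dense matrix-vector product."""
    return [sum(w * v for w, v in zip(r, x)) for r in m]
```

Both paths give the same result, but the block-circulant form stores `b` values per block instead of `b * b`. The block size `b` therefore sets how much weight structure is imposed on the layer.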




Blockwise Parallel Decoding for Deep Autoregressive Models

Neural Information Processing Systems

To overcome this limitation, we propose a novel blockwise parallel decoding scheme in which we make predictions for multiple time steps in parallel then back off to the longest prefix validated by a scoring model.
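The propose-then-verify loop can be sketched as follows, under strong simplifying assumptions. The scoring model is a deterministic greedy next-token function, the proposer is a separate toy function, and verification runs sequentially here for readability, whereas the scheme scores all positions in parallel. All names are hypothetical.

```python
def blockwise_decode(score_next, propose_block, prompt, max_len, block=4):
    """Greedy decoding that accepts the longest verified proposal prefix."""
    seq = list(prompt)
    while len(seq) < max_len:
        proposal = propose_block(seq, block)
        accepted = 0
        for i, tok in enumerate(proposal):
            # Keep a proposed token only if the scoring model agrees with it.
            if score_next(seq + proposal[:i]) != tok:
                break
            accepted += 1
        if accepted == 0:
            # Assumption: fall back to one scoring-model token to guarantee progress.
            seq.append(score_next(seq))
        else:
            seq.extend(proposal[:accepted])
    return seq[:max_len]


def toy_score_next(seq):
    """Toy scoring model: the next token is the previous token plus one, mod 10."""
    return (seq[-1] + 1) % 10


def toy_propose_block(seq, block):
    """Toy proposer: right on the first two guesses, wrong on the third."""
    guess = [(seq[-1] + i + 1) % 10 for i in range(block)]
    if block > 2:
        guess[2] = (guess[2] + 5) % 10  # deliberately incorrect guess
    return guess
```

Because every accepted token is checked against the scoring model, the output matches plain greedy decoding. A better proposer only reduces the number of iterations.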